□ What you need to know about ODM
□ The BIG Question: Will it work on YOUR data?
□ Aim of Presentation
Definition


Data Mining is knowledge discovery in data,

discovering non-trivial, hidden and previously
unknown information from large collections of data
Aim of Presentation


We've got so much data but we 
never do any data mining to 
make sense of what we have! 




Once we go live with our
dashboards we'll be able to do
some complex data mining!


□ How many times have you heard these types of comments? 

□ Data mining is one of those topics that many people talk about, but few
actually understand its full potential or what is involved
Aim of Presentation



□ This presentation aims to: 

> Describe Data Mining from a "Business" perspective 

> Demonstrate the value of integrating Oracle Data Mining and Business 
Intelligence: "Predictive BI"

> Explain how any Organization can use and benefit from data mining 
□ About Oracle Data Mining
Oracle Data Mining
Why do we need it? 


□ When we analyse data on 
dashboards, humans are good 
at spotting simple trends 
involving a small number of 
dimensions 

> e.g. Age and Sex 

□ But humans find it difficult to: 

> Find patterns when you need 
to analyse large amounts of 
data over many different 
dimensions 

> Spot associations or affinities 
with probabilities for 
determining a level of 
confidence 
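
To make "associations with probabilities" concrete, here is a minimal Python sketch of how the support and confidence of a simple rule (for example, "customers who buy pizza also buy coke") could be computed. The basket data and the function name `rule_stats` are invented for illustration and are not part of ODM; a data mining tool does this kind of counting across many attributes at once.

```python
from typing import Iterable, Set, Tuple


def rule_stats(baskets: Iterable[Set[str]], antecedent: str, consequent: str) -> Tuple[float, float]:
    """Return (support, confidence) for the rule antecedent -> consequent."""
    baskets = list(baskets)
    with_antecedent = [b for b in baskets if antecedent in b]
    with_both = [b for b in with_antecedent if consequent in b]
    # Support: share of all baskets that contain both items.
    support = len(with_both) / len(baskets) if baskets else 0.0
    # Confidence: share of antecedent baskets that also contain the consequent.
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical shopping baskets (illustrative data only).
baskets = [
    {"pizza", "coke"},
    {"pizza", "coke", "salad"},
    {"pizza"},
    {"salad"},
]
support, confidence = rule_stats(baskets, "pizza", "coke")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Confidence here is the "level of confidence" attached to an association: of the baskets that contain the first item, the fraction that also contain the second.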
Oracle Data Mining
Why do we need it? 


□ When asked "Business" questions, employees tend to give subjective or 
biased responses 

□ Take this example: 



Why is this product 
not selling? 
Oracle Data Mining
Why do we need it? 


We need better 
demos! 




Sales Rep 


Too many bugs! 



Service Agent 


Not enough 
marketing spend! 



Marketeer 


Delivery
timescales
are too tight!





Developer 


We don't get 
enough training! 


Consultant 
Oracle Data Mining
Why do we need it?


Which one is right?

Data mining will help you
find out the unbiased truth!
Oracle Data Mining
What does it do? 


□ Oracle Data Mining (ODM) is a component of the Oracle Database 

> All your data stays in the same place

□ With ODM, various algorithms can be applied to your historical data - 
these algorithms can build complex analytical models to explain your data 

□ The statistical models can then be used to help answer "Business" 
questions. For example: Is this angry customer likely to churn? 



Oracle Data Mining
Everyone can benefit!

□ Example 1: Human Resources 

> Build a statistical model on your Employee history: 

■ Employee Attributes (grade, office location, organisation etc) 

■ Salary increases 

■ Bonus payments 

■ Termination dates 

> During the next pay review cycle, use the statistical model to 
predict which employees will leave the company as a result of 
their pay/bonus awards 
Oracle Data Mining
Everyone can benefit!

□ Example 2: Spend Classification

> Build a statistical model on your Procurement history to 
categorise your spending behaviour 

> Use the information to optimise your sourcing decisions 

> For example: 

■ Company X realised they bought furniture from 25 different 
suppliers during the year 

■ Negotiating a deal with a single supplier for all office furniture 
saved £50K each year 
Oracle Data Mining
Everyone can benefit! 


□ Example 3: Fraudulent or Erroneous Expenses 

> Build a statistical model on your Expenses history, flagging
which expenses were previously found to be fraudulent
or to contain errors

> When employees submit new expenses, use the statistical 
model to predict which submissions may need a more detailed 
review prior to approval 
Oracle Data Mining
Everyone can benefit! 


□ Example 4: Customer Retention 

> Build a statistical model on your Customer Service history, 
flagging customers who have churned to a competitor 

> When customers contact the call centre, use the statistical 
model to predict the probability of the customer churning 
Oracle Data Mining
Everyone can benefit! 


□ Example 5: Marketing 

> Build a statistical model on the history of data generated 
through your loyalty card scheme 

■ Spending behaviour 

■ Range of products 

■ Combination of products bought together 

■ etc 

> For the next marketing campaign, use the statistical model to 
predict which customers are most likely to be interested in your 
latest "Pizza + Coke" special deal
Oracle Data Mining
Everyone can benefit! 


□ Example 6: Higher Education 

> Build a statistical model on the history of your student retention 

> When the next intake of students apply for their courses, use 
the statistical model to predict which students are most likely to 
withdraw early from their courses 
□ What you need to know about ODM
What Do You Need To Know About ODM?


□ ...nothing!

□ You just need to know an ODM expert! 
ODM and OBI Demonstration
How it happened


Brendan, how can we help
universities reduce the number
of students who withdraw
early from their courses?

Me


What data do you have?

Brendan


We've got a list of all students
going back 5 years

Me


Are you able to tell which
students withdrew early?

Brendan


Yes, we have a flag to state
which ones withdrew early
and which ones did not!

Me
ODM and OBI Demonstration
How it happened (continued)


Can you give me a data set
with 1 record per student
with as many attributes as
possible, including the
'withdraw flag'

Brendan


The data set should have
50% withdrawals and 50%
non-withdrawals

Brendan


I will produce a statistical
model that can help you
predict which future
students are likely to
withdraw early

Brendan
ODM and OBI Demonstration
Data Set handed to ODM Expert 


□ 1 record per student, each record with 28 attributes 

> 50% of records with WITHDRAW_FLAG='Y'

> 50% of records with WITHDRAW_FLAG='N'
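
The presentation does not say how the 50/50 data set was assembled. One common approach is to undersample the larger class, and the sketch below shows that idea in Python. It assumes the flag takes the values 'Y' and 'N'; the function name `balance_by_flag`, the `STUDENT_ID` field and the sample records are hypothetical.

```python
import random
from typing import Dict, List


def balance_by_flag(records: List[Dict], flag: str = "WITHDRAW_FLAG", seed: int = 0) -> List[Dict]:
    """Undersample the larger class so 'Y' and 'N' records are equal in number."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    yes = [r for r in records if r[flag] == "Y"]
    no = [r for r in records if r[flag] == "N"]
    n = min(len(yes), len(no))
    balanced = rng.sample(yes, n) + rng.sample(no, n)
    rng.shuffle(balanced)
    return balanced


# Hypothetical student records: 3 withdrawals and 7 non-withdrawals.
students = [{"STUDENT_ID": i, "WITHDRAW_FLAG": "Y" if i < 3 else "N"} for i in range(10)]
balanced = balance_by_flag(students)
print(len(balanced), sum(r["WITHDRAW_FLAG"] == "Y" for r in balanced))
```

Balancing matters because withdrawals are usually the minority; a model built on unbalanced data can look accurate simply by predicting "no withdrawal" for everyone.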
ODM and OBI Demonstration
Data Mining Model Produced 


□ The statistical model was produced by the ODM expert

□ Approximately 40% of the attributes were discarded for the analysis 


□ On completion, we built a simple database view that would return for
each new student application:

> Student Id 

> Withdraw Prediction (Y/N) 

> Prediction Probability 


ST_PK    WITHDRAW_PREDICTION    PROBABILITY
46447    N                      94.495412844036697
46452    N                      66.866840731070498
46599    Y                      71.247739602169979
46707    N                      66.866840731070498
46776    N                      66.866840731070498
46862    Y                      78.65168539325843
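
The view's output can be mimicked with a small Python sketch. It assumes the probability column reports the probability of the predicted class as a percentage (suggested by the 'N' rows carrying high values), and that a 0.5 cut-off separates 'Y' from 'N'. Both are assumptions, and the function name `score_rows` and the scores are hypothetical.

```python
from typing import Iterable, List, Tuple


def score_rows(scores: Iterable[Tuple[int, float]], threshold: float = 0.5) -> List[Tuple[int, str, float]]:
    """Turn (student_id, P(withdraw)) pairs into (id, 'Y'/'N', probability %) rows."""
    rows = []
    for student_id, p_withdraw in scores:
        prediction = "Y" if p_withdraw >= threshold else "N"
        # Report the probability of the predicted class, as a percentage.
        p_predicted = p_withdraw if prediction == "Y" else 1.0 - p_withdraw
        rows.append((student_id, prediction, round(p_predicted * 100, 2)))
    return rows


# Hypothetical model scores (not taken from the real model).
rows = score_rows([(1001, 0.25), (1002, 0.75)])
for row in rows:
    print(row)
```

Keeping the probability alongside the Y/N prediction lets a dashboard rank students by how certain the model is, rather than treating every 'Y' the same.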


□ We could now use these predictions in our dashboards!! 


© Peak Indicators Limited 















□ Demonstration of Predictive BI
[Dashboard screenshot: student list with columns including Id, Title, Status, Country, Age, Mode of Study, Org, Academic Degree, Academic Program, Withdraw Flag and Withdraw Prediction Result. Sample row: 3732, Miss, Female, Single, 47.00, E-Learner, SCIENCE, BSC HONS, UC8AG, Yes, Yes]
□ Further Considerations
Further Considerations
Which ODM Models to Use? 


□ For this "Higher Education" example, 4 different models were attempted 
before the right one was found 


□ How do you know when you have found the right model? 

> With ODM you can partition the data set so that, for example, 80% is
used for building the models and the remaining 20% is used for testing them
afterwards

> When ODM has finished analysing your data, it tests the resultant statistical
model against the portion of the data that was held back

> You can then keep refining your models until you have a satisfactory amount
of "LIFT" (an indicator of how much better your model performs than
random selection)
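
The build/test split and the lift measure can be sketched in a few lines of Python. This is a generic illustration rather than ODM's own procedure: the function names `split_80_20` and `lift_at`, the choice of the top 20% as the cut-off, and the scored results are all assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple


def split_80_20(rows: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Shuffle rows and split them into an 80% build set and a 20% test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def lift_at(scored: Sequence[Tuple[float, int]], fraction: float = 0.2) -> float:
    """Lift = response rate in the top-scored fraction / overall response rate."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(actual for _, actual in ranked[:top_n]) / top_n
    overall_rate = sum(actual for _, actual in ranked) / len(ranked)
    return top_rate / overall_rate if overall_rate else 0.0


build, test = split_80_20(range(100))

# Hypothetical test-set results: (predicted probability, actual outcome 1 or 0).
scored = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.5, 0),
          (0.4, 0), (0.3, 1), (0.2, 0), (0.1, 0), (0.05, 0)]
print(len(build), len(test), lift_at(scored, 0.2))
```

A lift above 1 means the records the model scores highest contain more real withdrawals than a random selection of the same size would.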
Further Considerations
Refining Models 


□ Do I have to do this all the time? 


[Diagram: Historical Data]



□ Answer: Yes and No 
Further Considerations
Data Quality 


□ Data quality is key 

□ Data cleansing could be a significant part of the whole data mining project 

□ The number of "Unknowns" should be reduced to an absolute minimum


[Chart: # Withdrawals Predicted by Marital Status]
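
One simple way to gauge the "Unknowns" problem before a pilot is to measure, per attribute, what share of records has no usable value. The Python sketch below does this; the set of values treated as unknown, the function name `unknown_share` and the sample records are assumptions for illustration.

```python
from typing import Dict, Iterable, Mapping, Optional

UNKNOWN_VALUES = {None, "", "Unknown"}  # assumed markers for missing data


def unknown_share(records: Iterable[Mapping[str, Optional[str]]]) -> Dict[str, float]:
    """Return, per attribute, the fraction of records whose value is unknown."""
    records = list(records)
    counts: Dict[str, int] = {}
    for record in records:
        for attribute, value in record.items():
            counts.setdefault(attribute, 0)
            if value in UNKNOWN_VALUES:
                counts[attribute] += 1
    return {a: c / len(records) for a, c in counts.items()} if records else {}


# Hypothetical student records (illustrative only).
students = [
    {"MARITAL_STATUS": "Single", "COUNTRY": "UK"},
    {"MARITAL_STATUS": "Unknown", "COUNTRY": "UK"},
    {"MARITAL_STATUS": None, "COUNTRY": ""},
    {"MARITAL_STATUS": "Married", "COUNTRY": "UK"},
]
shares = unknown_share(students)
print(shares)
```

Attributes with a high unknown share are candidates for cleansing, or for exclusion from the model if they cannot be repaired.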
□ The BIG Question:

Will it work on YOUR data?
The BIG Question!


□ Will it work on YOUR data? 

> Yes! But...

> ODM is ideally suited to a company with a mature(ish) BI environment
and good data quality

> The only real way to prove the effectiveness of ODM is to perform a
pilot on your own actual data

> You should expect a pilot to take approximately 3-6 weeks
□ Questions?